@kaxil kaxil commented Sep 24, 2025

We encountered an intermittent test failure in the Task SDK that occurs only on Python 3.12: https://github.com/apache/airflow/actions/runs/17957802850/job/51079451184#step:9:696. We marked that test as XFAIL in 6d2aac6.

DeprecationWarning: This process (pid=78) is multi-threaded, use of fork() may lead to deadlocks in the child.

Failed Test:

task-sdk/tests/task_sdk/execution_time/test_supervisor.py::TestWatchedSubprocess::test_reading_from_pipes

Example failure: https://github.com/apache/airflow/actions/runs/17957802850/job/51079451184#step:9:696

Root Cause Analysis

The Threading Issue

Python 3.12 introduced a DeprecationWarning when os.fork() is called in a multi-threaded process.
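The warning can be reproduced outside Airflow with a minimal sketch like the one below. It assumes a POSIX platform where `os.fork()` is available; the helper name `fork_with_live_thread` is made up for illustration and is not part of the Task SDK.

```python
import os
import threading
import warnings


def fork_with_live_thread():
    """Fork while a second thread is alive and return any DeprecationWarning texts.

    Hypothetical helper for illustration only. On Python 3.12+ the parent
    process is expected to receive the "multi-threaded ... fork()" warning;
    on older versions the returned list is expected to be empty.
    """
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()  # the process is now multi-threaded
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # do not let filters hide the warning
            pid = os.fork()
            if pid == 0:
                os._exit(0)  # child: exit immediately without running cleanup
            os.waitpid(pid, 0)  # parent: reap the child
    finally:
        stop.set()
        worker.join()
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]


if __name__ == "__main__":
    for message in fork_with_live_thread():
        print(message)
```

The key point is that the warning depends only on another thread being alive at the moment of the fork, not on what that thread is doing.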

Code Path Investigation

Through systematic analysis, we discovered:

  1. Source: asgiref.sync.sync_to_async creates persistent ThreadPoolExecutor threads
  2. Location: task-sdk/src/airflow/sdk/execution_time/context.py:217
    conn = await sync_to_async(secrets_backend.get_connection)(conn_id)
  3. Flow:
    test_async_get_connection_from_api
      └── _async_get_connection()
          └── ensure_secrets_backend_loaded()
              └── sync_to_async(secrets_backend.get_connection)  # Creates ThreadPoolExecutor
                  └── ThreadPoolExecutor thread persists
                      └── Next test: test_reading_from_pipes
                          └── os.fork() triggers Python 3.12 warning
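Why the thread outlives the test can be sketched with the standard library alone. The example below uses a plain `concurrent.futures.ThreadPoolExecutor` as a stand-in for the executor that asgiref manages (an assumption for illustration; asgiref itself is not imported), and the names `demo-worker` and `count_demo_workers` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

PREFIX = "demo-worker"  # hypothetical thread name prefix used only in this sketch


def count_demo_workers():
    """Count live threads that belong to the demo executor."""
    return sum(t.name.startswith(PREFIX) for t in threading.enumerate())


def run_one_task_and_observe():
    """Return worker counts (after the task finished, after shutdown)."""
    executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix=PREFIX)
    # The task completes here, but the worker thread stays parked on the
    # executor's queue, waiting for more work.
    assert executor.submit(sum, [1, 2, 3]).result() == 6
    alive_after_task = count_demo_workers()
    # Only an explicit shutdown joins the idle worker thread.
    executor.shutdown(wait=True)
    alive_after_shutdown = count_demo_workers()
    return alive_after_task, alive_after_shutdown


if __name__ == "__main__":
    print(run_one_task_and_observe())
```

An idle worker like this is enough to make the process count as multi-threaded when a later test forks.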
    

Testing Instructions for Reviewers:

# Test the specific problematic sequence
breeze run pytest task-sdk/tests/task_sdk/execution_time/test_context_cache.py::TestAsyncConnectionCache::test_async_get_connection_from_api task-sdk/tests/task_sdk/execution_time/test_supervisor.py::TestWatchedSubprocess::test_reading_from_pipes -v

# Test broader async test suite  
breeze run pytest task-sdk/tests/task_sdk/execution_time/ -k asyncio

Example run:

root@7bbad33290d7:/opt/airflow# pytest  task-sdk/tests/task_sdk/execution_time/test_context_cache.py::TestAsyncConnectionCache::test_async_get_connection_from_api   task-sdk/tests/task_sdk/execution_time/test_supervisor.py::TestWatchedSubprocess::test_reading_from_pipes
================================================================================================ test session starts =================================================================================================
platform linux -- Python 3.12.11, pytest-8.4.2, pluggy-1.6.0 -- /usr/local/bin/python3
cachedir: .pytest_cache
rootdir: /opt/airflow/task-sdk
configfile: pyproject.toml
plugins: kgb-7.2, asyncio-1.2.0, cov-7.0.0, custom-exit-code-0.3.0, icdiff-0.9, instafail-0.5.0, mock-3.15.1, rerunfailures-16.0.1, timeouts-1.2.1, unordered-0.7.0, requests-mock-1.12.1, xdist-3.8.0, time-machine-2.19.0, anyio-4.11.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=function, asyncio_default_test_loop_scope=function
setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
collected 2 items

task-sdk/tests/task_sdk/execution_time/test_context_cache.py::TestAsyncConnectionCache::test_async_get_connection_from_api PASSED                                                                              [ 50%]
task-sdk/tests/task_sdk/execution_time/test_supervisor.py::TestWatchedSubprocess::test_reading_from_pipes FAILED                                                                                               [100%]

====================================================================================================== FAILURES ======================================================================================================
___________________________________________________________________________________ TestWatchedSubprocess.test_reading_from_pipes ____________________________________________________________________________________
task-sdk/tests/task_sdk/execution_time/test_supervisor.py:274: in test_reading_from_pipes
    assert captured_logs == unordered(
E   assert [{'category': 'DeprecationWarning', 'event': 'This process (pid=3505) is multi-threaded, use of fork() may lead to dea...lsite', 'filename': '/opt/airflow/task-sdk/tests/task_sdk/execution_time/test_supervisor.py', 'level': 'warning', ...}] == [{'logger': 'task.stdout', 'event': "I'm a short message", 'level': 'info', 'timestamp': '2024-11-07T12:34:56.078901Z'... 249, 'logger': 'py.warnings', 'timestamp': datetime.datetime(2024, 11, 7, 12, 34, 56, 78901, tzinfo=Timezone('UTC'))}]
E     Extra items in the left sequence:
E     {'category': 'DeprecationWarning', 'event': 'This process (pid=3505) is multi-threaded, use of fork() may lead to dead...the child.', 'filename': '/opt/airflow/task-sdk/src/airflow/sdk/execution_time/supervisor.py', 'level': 'warning', ...}

I do not want to admit how much time @jedcunningham and I spent debugging this frustrating issue :)



Resolves intermittent test failure where `asgiref.sync.sync_to_async` creates
ThreadPoolExecutors that persist between tests, causing Python 3.12+ to warn
about forking multi-threaded processes when supervisor.py calls `os.fork()`.

Added targeted cleanup of asgiref executors after async tests to ensure
a clean thread environment for subsequent tests that use `fork()`.
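The cleanup idea can be sketched generically as follows. This is not the code from this PR: `SharedPoolOwner`, `reset_shared_executor`, and the `shared-pool` prefix are hypothetical names, and the sketch assumes the shared executor is reachable as a class attribute so that it can be shut down and replaced with a fresh one.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

PREFIX = "shared-pool"  # hypothetical thread name prefix for this sketch


class SharedPoolOwner:
    """Hypothetical stand-in for a library object that owns a long-lived executor."""

    executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix=PREFIX)


def pool_thread_count():
    """Count live threads created by the shared executor."""
    return sum(t.name.startswith(PREFIX) for t in threading.enumerate())


def reset_shared_executor(owner=SharedPoolOwner):
    """Join the executor's idle workers and install a fresh, thread-free executor.

    A shut-down executor rejects new work, so it is replaced rather than
    only stopped; the new executor starts no thread until work is submitted.
    """
    owner.executor.shutdown(wait=True)
    owner.executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix=PREFIX)
```

In a test suite, a helper like this would typically run in the teardown of the async tests, so that a later test that forks starts from a process with no leftover worker threads.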

Example failure: https://github.com/apache/airflow/actions/runs/17957802850/job/51079451184#step:9:696

kaxil commented Sep 24, 2025

Failures are due to the Docker rate limit:

ERROR providers/mongo/tests/unit/mongo/hooks/test_mongo.py::TestMongoHook::test_invalid_conn_type_srv - docker.errors.APIError: 500 Server Error for http+docker://localhost/v1.48/images/create?tag=latest&fromImage=mongo: Internal Server Error ("toomanyrequests: You have reached your unauthenticated pull rate limit. https://www.docker.com/increase-rate-limit")


@amoghrajesh amoghrajesh left a comment

That's a strange one! Was it introduced just now by some dependency bump, or had it always been there until we eventually hit it?

@kaxil kaxil merged commit 8e6f03e into apache:main Sep 24, 2025
76 of 77 checks passed
@kaxil kaxil deleted the add-fix-async branch September 24, 2025 07:22
kaxil added a commit that referenced this pull request Sep 26, 2025
@kaxil kaxil added this to the Airflow 3.1.1 milestone Sep 26, 2025
abdulrahman305 bot pushed a commit to abdulrahman305/airflow that referenced this pull request Sep 30, 2025